Skip to content

zig: fix impurities for Zig compiler and Zig outputs#214440

Closed
strager wants to merge 1 commit intoNixOS:masterfrom
strager:zig-cpu-impure
Closed

zig: fix impurities for Zig compiler and Zig outputs#214440
strager wants to merge 1 commit intoNixOS:masterfrom
strager:zig-cpu-impure

Conversation

@strager
Copy link
Contributor

@strager strager commented Feb 4, 2023

Description of changes

By default, Zig will use the generate code for the native CPU. This is impure and leads to illegal instruction crashes for users which have different CPUs than the build bots.

Change Zig's default to -mcpu=baseline.

To test, I built on my new AMD Zen 3 machine used qemu to emulate an old Intel Sandy Bridge machine, and observed no crashes with this patch:

$ nix-shell -I ~/Projects/ -p ncdu \
    --run 'qemu-x86_64-static -cpu SandyBridge $(which ncdu)'
Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 23.05 Release Notes (or backporting 22.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
    • (Release notes changes) Ran nixos/doc/manual/md-to-db.sh to update generated release notes
  • Fits CONTRIBUTING.md.

By default, Zig will use the generate code for the native CPU. This is
impure and leads to illegal instruction crashes for users which have
different CPUs than the build bots.

Change Zig's default to -mcpu=baseline.

To test, I built on my new AMD Zen 3 machine used qemu to emulate an old
Intel Sandy Bridge machine, and observed no crashes with this patch:

    $ nix-shell -I ~/Projects/ -p ncdu \
        --run 'qemu-x86_64-static -cpu SandyBridge $(which ncdu)'
@ofborg ofborg bot added 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-linux: 11-100 This PR causes between 11 and 100 packages to rebuild on Linux. labels Feb 4, 2023
Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, upstream here. I'd like to work with you on this and help come to a satisfactory resolution for nixos maintainers as well as our mutual users.

I think cpu-purity.patch as currently implemented is problematic, because it has far-reaching consequences well beyond the scope of making nixos packages use the baseline CPU. It's also a widely used data structure, and it's difficult to be sure this diff will not violate assumptions in many places of the codebase.

Please consider that, while it is crucial for zig to use the baseline CPU when zig is being used by nix to build nix packages for binary distributions, another important use case is nixos users directly using the zig binary provided by nixos to develop their own upstream projects. This patch will cause great confusion and harm for the latter use case.

Looking at the rest of this pull request, to me the explicit uses of -Dcpu=baseline look like the right tool for the job. Is there some reason this solution is unsatisfactory?

P.S. I am available to discuss this topic in #nixos on libera.

runHook preInstall

zig build -Drelease-safe=true -Dcpu=baseline --prefix $out install
zig build -Drelease-safe=true --prefix $out install
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"install" is the default build step and can therefore be omitted, having the same effect as before:

Suggested change
zig build -Drelease-safe=true --prefix $out install
zig build -Drelease-safe=true -Dcpu=baseline --prefix $out

@AndersonTorres
Copy link
Member

Split the commit per expression.

@strager
Copy link
Contributor Author

strager commented Feb 4, 2023

to me the explicit uses of -Dcpu=baseline look like the right tool for the job. Is there some reason this solution is unsatisfactory?

In my opinion, it shouldn't be the job of every package to make the toolchain deterministic. Nix' toolchains should be deterministic by default.

For example, Nix' C compiler wrapper scripts try hard to make the C compiler deterministic, going as far as to strip out -march=native options in build scripts automatically:

# Clear march/mtune=native -- they bring impurity.

I don't know what the policy for Nixpkgs is, but I am following the lead of how the C toolchain seems to work.

@strager
Copy link
Contributor Author

strager commented Feb 4, 2023

I think cpu-purity.patch as currently implemented is problematic, because it has far-reaching consequences well beyond the scope of making nixos packages use the baseline CPU. It's also a widely used data structure, and it's difficult to be sure this diff will not violate assumptions in many places of the codebase.

Okay, that makes sense. What is the best way to configure Zig to be deterministic by default regarding CPU code generation? (I understand there are complications with things like libc detection. I don't know if that's relevant to -mcpu.)

@sagehane
Copy link
Contributor

sagehane commented Feb 4, 2023

Wouldn't it be better to make a custom build step for Zig, used internally within Nixpkgs?

Example:
Adding zig to nativeBuildInputs will add zig build -Dtarget=<whatever> ${zigFlags} --prefix $out or similar to the build step.

Relevant feature that the hypothetical build step can support #185665 (comment):

I think a better way would be accessing the desired target architecture, operating system along with minimum/maximum OS version, and ABI, along with potentially the glibc version, from nix, and then specifying these via the standard -Dtarget option.

@figsoda
Copy link
Member

figsoda commented Feb 4, 2023

Another way to do it would be to have a setup hook for zig to make baseline the default, so this will work automatically for existing uses of zig within nativeBuildInputs, and wouldn't change anything outside the nix context

I'm not familiar enough with zig to know how to do that, just throwing some ideas

@AndersonTorres AndersonTorres marked this pull request as draft February 4, 2023 11:26
@AndersonTorres
Copy link
Member

Given the nature of discussion, I will put this in draft mode.

@AndersonTorres
Copy link
Member

I am against the .patch file because it unconditionally changes the source code, effectively making it a fork. Yes, I am purist about the purity, but it should be reached by other means like config/build flags, env vars or stuff like that.

I believe the upstream developers (hello andrewrk :)) can provide such tooling.

@edrex
Copy link
Contributor

edrex commented Feb 7, 2023

@AndersonTorres from outside this conversation, marking someone else's PR as draft seems like a breach of protocol.

@AndersonTorres
Copy link
Member

There is no other suitable tool to provide a more accurate status. Also it can be easily reverted when desired.

@strager strager marked this pull request as ready for review February 7, 2023 20:16
@strager
Copy link
Contributor Author

strager commented Feb 7, 2023

Draft status reverted as desired.

@AndersonTorres
Copy link
Member

Is it ready to merge? All tests are passing.

@figsoda
Copy link
Member

figsoda commented Feb 7, 2023

No. The current approach has notable drawbacks, as @andrewrk pointed out in #214440 (review)

@strager
Copy link
Contributor Author

strager commented Feb 7, 2023

The current approach has notable drawbacks, as @andrewrk pointed out in #214440 (review)

Can you reiterate them here (perhaps rephrased from the Nix point of view rather than upstream's)? I don't find @andrewrk's arguments compelling enough to block this patch. Maybe I'm misunderstanding something about how Nix should mandate purity.

@edrex
Copy link
Contributor

edrex commented Feb 7, 2023

I think what I'm hearing is:

  • drop / defer the purity patch, so downstream zig developers using nixpkgs#zig get the expected language defaults. If they want purity across baseline cpu variants, they should set the flag
  • it seems desirable to consolidate setting the flag for nixpkgs builds somehow, as we should expect the number of zig built packages in nixpkgs to continue to grow. How are default compiler flags usually set in nixpkgs for other compilers?

@edrex
Copy link
Contributor

edrex commented Feb 7, 2023

It seems like the time to add a default to baseline would be when Zig support is added to the nixpkgs builders.

I'm in favor of just patching in the build argument for each package for now, to get us un-broken.

@figsoda
Copy link
Member

figsoda commented Feb 7, 2023

Can you reiterate them here (perhaps rephrased from the Nix point of view rather than upstream's)?

I agree zig should default to baseline cpu when used as a dependency in nixpkgs, but zig is also used outside the nix sandbox, and the patch changes the default that could be confusing for users and upstream.

How are default compiler flags usually set in nixpkgs for other compilers?

I don't know much about the other compilers, but for rust it is abstracted away with buildRustPackage, and flags are added with things like cargoBuildFlags and cargoTestFlags. For zig, since we don't have something like buildRustPackage, the easiest solution might be to use setup hooks #214440 (comment).

@AndersonTorres
Copy link
Member

So we need to create a buildZigApp?

@winterqt
Copy link
Member

winterqt commented Feb 8, 2023

I think setup hooks are the best option here, creating a buildPhase that runs zig build with the required options. I believe this because most projects use Zig's embedded build system, and since Zig doesn't need things such as dependency management a la Cargo (yet), we can get by with just a setup hook to keep it simple.

What I'm not sure about is how we should handle cases like ncdu, where the flags wouldn't be able to be set automatically due to its use of a Makefile. We could either wrap Zig and make the flags be set conditionally, or we can just specify the flags manually for ncdu as we do now with its ZIG_FLAGS environment variable.

@strager If you'd like assistance with any of this, or have any better ideas based on these, feel free to reach out -- more than happy to help.

@figsoda
Copy link
Member

figsoda commented Feb 8, 2023

threw something together, just an idea and still need more work, feel free to open a new pr or update in this one

--- a/pkgs/development/compilers/zig/0.10.nix
+++ b/pkgs/development/compilers/zig/0.10.nix
@@ -4,6 +4,7 @@
 , cmake
 , coreutils
 , llvmPackages
+, makeWrapper
 , libxml2
 , zlib
 }:
@@ -22,6 +23,7 @@ stdenv.mkDerivation rec {
   nativeBuildInputs = [
     cmake
     llvmPackages.llvm.dev
+    makeWrapper
   ];
 
   buildInputs = [
@@ -49,11 +51,19 @@ stdenv.mkDerivation rec {
     "-DCMAKE_SKIP_BUILD_RPATH=ON"
   ];
 
+  postInstall = ''
+    mkdir -p $out/nix-support/bin
+    makeWrapper $out/bin/zig $out/nix-support/bin/zig \
+      --append-flags -Dcpu=baseline
+  '';
+
   doCheck = true;
   installCheckPhase = ''
     $out/bin/zig test --cache-dir "$TMPDIR" -I $src/test $src/test/behavior.zig
   '';
 
+  setupHook = ./setup-hook.sh;
+
   meta = with lib; {
     homepage = "https://ziglang.org/";
     description =
--- /dev/null
+++ b/pkgs/development/compilers/zig/setup-hook.sh
@@ -0,0 +1,5 @@
+zigWrapperHook() {
+    export PATH=@out@/nix-support/bin:$PATH
+}
+
+postUnpackHooks+=(zigWrapperHook)

@winterqt
Copy link
Member

winterqt commented Feb 8, 2023

As an alternative to that, we could also make a Zig wrapper a la cc-wrapper that sets flags based on commands/environment variables, and I'd be happy to work on this if you (@strager) don't mind.

@edrex
Copy link
Contributor

edrex commented Feb 9, 2023

#214356 (comment)

a patch that passes -DZIG_TARGET_MCPU=baseline when building the zig compiler would fix the originally reported immediate issue

It seems like we should do as @ifreund recommends, a one flag patch, and allow the zig builder story to bake awhile.

@winterqt
Copy link
Member

winterqt commented Feb 9, 2023

Agreed.

I can confirm that the following patch builds on both aarch64-linux and aarch64-darwin. @strager if you wouldn't mind updating this branch to this in the meantime, so we can fix the big issue at hand, that'd be great. (Feel free to update the comment if you'd like.) Thanks!

diff --git a/pkgs/development/compilers/zig/0.10.nix b/pkgs/development/compilers/zig/0.10.nix
index 89f23b9ca25..966be329bef 100644
--- a/pkgs/development/compilers/zig/0.10.nix
+++ b/pkgs/development/compilers/zig/0.10.nix
@@ -47,6 +47,9 @@ stdenv.mkDerivation rec {
   cmakeFlags = [
     # file RPATH_CHANGE could not write new RPATH
     "-DCMAKE_SKIP_BUILD_RPATH=ON"
+
+    # ensure determinism in the compiler build
+    "-DZIG_TARGET_MCPU=baseline"
   ];
 
   doCheck = true;
diff --git a/pkgs/development/compilers/zig/0.9.1.nix b/pkgs/development/compilers/zig/0.9.1.nix
index e7c62a4cf93..637186f686e 100644
--- a/pkgs/development/compilers/zig/0.9.1.nix
+++ b/pkgs/development/compilers/zig/0.9.1.nix
@@ -62,6 +62,9 @@ stdenv.mkDerivation rec {
   cmakeFlags = [
     # file RPATH_CHANGE could not write new RPATH
     "-DCMAKE_SKIP_BUILD_RPATH=ON"
+
+    # ensure determinism in the compiler build
+    "-DZIG_TARGET_MCPU=baseline"
   ];
 
   doCheck = true;

@AndersonTorres
Copy link
Member

Is it ready to merge? All tests are passing.

Is it ready to merge? All tests are passing.

@strager
Copy link
Contributor Author

strager commented Feb 17, 2023

It seems that people prefer #215994 and per-package workarounds instead of my fix-everything solution. I didn't test that patch, but it seems like it should do the right thing.

@strager strager closed this Feb 17, 2023
@winterqt
Copy link
Member

Hi @strager, please see #214440 (comment) and the surrounding discussion for ideas with regards to the problem of what the compiler outputs. Do you have any thoughts on the matter?

(Lastly, I was waiting until some time passed before submitting my own PR with just the Zig compiler fix as a courtesy towards you for starting this work, but I assume the author of the other PR didn't see that there was a discussion here -- sorry about that.)

@adamcstephens
Copy link
Contributor

I did not see this PR, sorry. I do think it would be beneficial to fix this globally rather than require every individual package to do so.

@adamcstephens
Copy link
Contributor

@strager Are you open to resurrecting this? I just reviewed another new zig package and would prefer your global implementation to setting baseline on all packages. (Which surely misses some)

@sagehane
Copy link
Contributor

sagehane commented Mar 23, 2023

We should be setting the baseline for all Nixpkg packages but we should do it through a different approach.
The current approach also affects packages built by the end users, when it should ideally only affect packages built by the system itself.

The following are relevant:
#214440 (comment)
ziglang/zig#14281

Edit: I think the approach that Ekaitz is taking for Guix is a good role model.
https://issues.guix.gnu.org/60889 (especially the section on cross compilation)

@strager
Copy link
Contributor Author

strager commented Mar 24, 2023

[I] would prefer your global implementation to setting baseline on all packages. (Which surely misses some)

I agree. I think that Nix-built tools should behave the same whether you are building another Nix package or using the tool outside of Nix, and that behavior should be "it works".

Other people, such as @sagehane and Zig's lead maintainer, disagree and think that the "same" tool should behave in two different ways depending on how it's invoked. I'm not interested fighting this fight though.

@sagehane
Copy link
Contributor

tl;dr of what I'm trying to convey: How does modifying the behaviour of Zig aid the users to have consistent/predictable behaviour outside of the Nix build system? I would think it does the opposite.

I think that Nix-built tools should behave the same whether you are building another Nix package or using the tool outside of Nix, and that behavior should be "it works".

  1. Every use-case has different needs. NixOS cares about reproducibility and end users might care more about other things, like optimising for their local machine or using a different baseline when sharing binaries with computers that are guaranteed to support x86_64-v2+, etc.
  2. If anything, altering Zig's behaviour is far from "making it work". It's essentially a foot-gun for people who expect the behaviour to resemble that of upstream's.

Other people ... think that the "same" tool should behave in two different ways depending on how it's invoked.

Well yeah, is that strange? Isn't that how other compilers/build systems packaged by Nixpkgs work too? Or does something like GCC/make enforce compiling for a baseline when used for userspace too?

@strager
Copy link
Contributor Author

strager commented Apr 26, 2023

How does modifying the behaviour of Zig aid the users to have consistent/predictable behaviour outside of the Nix build system? I would think it does the opposite.

I agree that my proposal does not work toward that goal. I don't think that's a goal of Nix or Nixpkgs, but I could be wrong.

does something like GCC/make enforce compiling for a baseline when used for userspace too?

I am not advocating for "enforcing" compiling for the baseline. I am advocating for a compiling for the baseline by default.

I think both GCC and Clang compile for the baseline by default at least on Linux x86_64 (both upstream and in Nix).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-linux: 11-100 This PR causes between 11 and 100 packages to rebuild on Linux.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants